AMD HIP Programming Guide: Architectural Foundations of the HIP Ecosystem

HIP is a thin abstraction layer that provides source-level compatibility between AMD and NVIDIA GPUs. On AMD hardware it runs on the ROCm (Radeon Open Compute) stack, specifically the Heterogeneous System Architecture (HSA) runtime and the Kernel Fusion Driver (KFD).

1. Initialization Bootstrap

Initialization begins with low-level driver handshakes: hsa_init(), which takes no arguments, and hsaKmtOpenKFD(...). These calls establish the communication channel between user-space applications and AMD GPU hardware.

2. Topology & Property Discovery

Before launching kernels, the runtime discovers hardware capabilities with hsaKmtAcquireSystemProperties and hsaKmtGetNodeProperties. It then maps memory to GPU nodes with hsaKmtMapMemoryToGPUNodes so that the pages are visible in the device's page tables.

3. The Compilation Pipeline

The bridge from CUDA to HIP rests on two tools: hipify-perl (a regex-based source translator) and hipcc (a compiler wrapper).

# Porting Workflow Example
hipify-perl square.cu > square.cpp
hipcc square.cpp -o square.out
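
To make "regex-based" concrete, here is a toy sketch of the kind of textual substitution such a translator performs. It is not hipify-perl's implementation: the function names toy_hipify and toy_hipify_demo are hypothetical, and the rule table holds only a handful of well-known CUDA-to-HIP renames, whereas the real tool covers far more of the API.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Toy illustration only (hypothetical helper, not part of any HIP tool).
// Each rule rewrites one whole CUDA identifier to its HIP counterpart.
inline std::string toy_hipify(const std::string& cuda_source) {
    static const std::vector<std::pair<std::regex, std::string>> rules = {
        {std::regex(R"(\bcuda_runtime\.h\b)"), "hip/hip_runtime.h"},
        {std::regex(R"(\bcudaMalloc\b)"), "hipMalloc"},
        {std::regex(R"(\bcudaMemcpy\b)"), "hipMemcpy"},
        {std::regex(R"(\bcudaMemcpyHostToDevice\b)"), "hipMemcpyHostToDevice"},
        {std::regex(R"(\bcudaFree\b)"), "hipFree"},
    };

    std::string out = cuda_source;
    for (const auto& rule : rules) {
        // \b anchors keep a rule from firing inside a longer identifier.
        out = std::regex_replace(out, rule.first, rule.second);
    }
    return out;
}

// Small self-check of the substitution on a one-line input.
inline void toy_hipify_demo() {
    assert(toy_hipify("cudaFree(p);") == "hipFree(p);");
}
```

Because the rewrite is purely textual, it has no view of types or scopes. The word-boundary anchors at least keep an identifier such as my_cudaMalloc_wrapper untouched.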

4. Versioning Logic

The HIP version is packed into a single integer, so compile-time checks against HIP_VERSION and runtime queries such as hipRuntimeGetVersion can be compared numerically:

$$\text{HIP\_VERSION} = \text{MAJOR} \times 10^7 + \text{MINOR} \times 10^5 + \text{PATCH}$$
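
The formula can be exercised in plain C++ without any ROCm headers. The sketch below uses hypothetical helper names (HipVersionParts, encode_hip_version, decode_hip_version, encode_hip_version_checked); only the arithmetic is taken from the formula above. The range checks are an assumption that follows from the multipliers: the fields decode unambiguously only while MINOR stays below 100 and PATCH below 100000.

```cpp
#include <cassert>

// Hypothetical helpers; only the arithmetic comes from the formula above.
struct HipVersionParts {
    int major;
    int minor;
    int patch;
};

// MAJOR * 10^7 + MINOR * 10^5 + PATCH
constexpr int encode_hip_version(int major, int minor, int patch) {
    return major * 10000000 + minor * 100000 + patch;
}

// Inverse of encode_hip_version, valid while minor < 100 and patch < 100000.
constexpr HipVersionParts decode_hip_version(int version) {
    return {version / 10000000, (version / 100000) % 100, version % 100000};
}

// Same encoding, with the assumed field ranges checked at run time.
inline int encode_hip_version_checked(int major, int minor, int patch) {
    assert(minor >= 0 && minor < 100);
    assert(patch >= 0 && patch < 100000);
    return encode_hip_version(major, minor, patch);
}

// Major 6, minor 0, patch 325.
static_assert(encode_hip_version(6, 0, 325) == 60000325,
              "6*10^7 + 0*10^5 + 325");
```

Decoding 60000325 with decode_hip_version gives back major 6, minor 0, patch 325. That round trip is what lets code compare a packed version against a required minimum with a single integer comparison.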

QUESTION 1

Which tool converts CUDA (.cu) files into HIP-ready C++ (.cpp) files via regex mapping?

hipcc

hipify-perl

hsa_init

rocminfo

QUESTION 2

What is the primary purpose of the Kernel Fusion Driver (KFD) in this architecture?

To compile device-side ISA code.

To manage user-space to GPU communication and page table mapping.

To calculate the HIP_VERSION macro.

To provide high-level math intrinsics.

QUESTION 3

Given the formula, what is the value of HIP_VERSION for major version 6, minor 0, and patch 325?

600325

60032500

60000325 (6*10^7 + 0*10^5 + 325)

QUESTION 4

Which low-level HSA function must be called to establish the system-wide runtime context?

hsa_init()

hsaKmtAllocMemory(...)

hsaKmtCreateEvent(...)

hsa_agent_iterate_isas(...)

QUESTION 5

What does hsa_system_get_major_extension_table do?

It retrieves the function table for a given HSA extension and major version, linking callers to the specific HSA implementation on the host.

It converts CUDA code to C++.

It allocates 4096-byte buffers.

It releases KMT system properties.